# Efficient pre-training

## Olmo2 11B SuperBPE T180k

*UW · Apache-2.0 · Large Language Model, Transformers, English · 29 downloads · 2 likes*

An 11-billion-parameter language model trained with the SuperBPE tokenizer, whose vocabulary contains both conventional subword tokens and "superword" tokens that span multiple words.
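A minimal sketch of inspecting the SuperBPE tokenizer with Hugging Face `transformers`. The repo id `UW/OLMo2-11B-SuperBPE-t180k` is inferred from the model name and developer listed above and is an assumption; adjust it if the actual id differs.

```python
from transformers import AutoTokenizer

# Assumed repo id, inferred from the listing above.
MODEL_ID = "UW/OLMo2-11B-SuperBPE-t180k"

tok = AutoTokenizer.from_pretrained(MODEL_ID)

text = "By the way, the quick brown fox jumps over the lazy dog."
ids = tok(text)["input_ids"]
pieces = tok.convert_ids_to_tokens(ids)

# A SuperBPE vocabulary can merge across whitespace, so some pieces may be
# multi-word "superword" units rather than ordinary subwords.
print(len(pieces), pieces)
```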
## Gte Multilingual Mlm Base

*Alibaba-NLP · Apache-2.0 · Large Language Model, Safetensors · 342 downloads · 12 likes*

A multilingual text encoder from the mGTE series, covering 75 languages with a maximum context length of 8,192 tokens. It uses a BERT-style architecture with rotary position embeddings (RoPE) and GLU feed-forward layers, and performs strongly on the GLUE and XTREME-R benchmarks.
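A hedged sketch of loading the encoder as a masked language model with `transformers`. Passing `trust_remote_code=True` is an assumption in case the mGTE architecture is not built into the installed `transformers` version; the repo id is inferred from the listing above.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

MODEL_ID = "Alibaba-NLP/gte-multilingual-mlm-base"  # assumed repo id

tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID, trust_remote_code=True)

# Predict a masked token in a simple sentence.
text = f"Paris is the capital of {tok.mask_token}."
inputs = tok(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

mask_pos = (inputs["input_ids"] == tok.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_pos].topk(5, dim=-1).indices[0].tolist()
print(tok.convert_ids_to_tokens(top_ids))
```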
## Ltg Bert Babylm

*ltg · Large Language Model, Transformers, English · 594 downloads · 2 likes*

A BERT variant trained on the 100-million-word BabyLM Challenge dataset, optimized for strong performance on medium-scale corpora.
## M2 Bert 80M 2k Retrieval

*togethercomputer · Apache-2.0 · Text Embedding, Transformers, English · 538 downloads · 15 likes*

An 80M-parameter M2-BERT checkpoint pre-trained with a sequence length of 2,048 and fine-tuned for long-context retrieval tasks.
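A hedged sketch of producing a retrieval embedding with this checkpoint. The remote-code interface shown here (loading through `AutoModelForSequenceClassification`, padding to 2,048 tokens with a BERT tokenizer, and reading a `sentence_embedding` output) follows the pattern the M2-BERT retrieval model cards describe, but treat the details as assumptions if your version differs.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MAX_LEN = 2048  # sequence length this checkpoint was trained for

# Assumed loading path; M2-BERT ships custom modeling code.
model = AutoModelForSequenceClassification.from_pretrained(
    "togethercomputer/m2-bert-80M-2k-retrieval", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", model_max_length=MAX_LEN)

inputs = tokenizer(
    ["Every morning, I make a cup of coffee to start my day."],
    return_tensors="pt",
    padding="max_length",
    truncation=True,
    max_length=MAX_LEN,
    return_token_type_ids=False,
)
outputs = model(**inputs)

# Assumed output key for the pooled embedding.
embedding = outputs["sentence_embedding"]
print(embedding.shape)
```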
## Retromae Small Cs

*Seznam · Text Embedding, Transformers, Other · 7,759 downloads · 5 likes*

A BERT-small model developed by Seznam.cz and pre-trained on Czech web corpora with the RetroMAE objective, suitable for a range of Czech natural language processing tasks.
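A minimal sketch of using the encoder for Czech sentence embeddings. Taking the [CLS] vector is the usual pooling choice for RetroMAE-style encoders and is an assumption here; the repo id is inferred from the listing above.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "Seznam/retromae-small-cs"  # assumed repo id

tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)

sentences = ["Dnes je krásný den.", "Počasí je dnes nádherné."]
batch = tok(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    out = model(**batch)

# [CLS] pooling: take the first token of the last hidden state (assumed choice).
emb = out.last_hidden_state[:, 0]
emb = torch.nn.functional.normalize(emb, dim=-1)
print((emb[0] @ emb[1]).item())  # cosine similarity of the two sentences
```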
## Sheared LLaMA 1.3B

*princeton-nlp · Apache-2.0 · Large Language Model, Transformers · 11.09k downloads · 94 likes*

Sheared-LLaMA-1.3B is an efficient 1.3B-parameter language model derived from LLaMA-2-7B through structured pruning followed by continued pre-training.
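A minimal generation sketch with `transformers`, assuming the Hugging Face repo id `princeton-nlp/Sheared-LLaMA-1.3B` (inferred from the model name and developer above).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "princeton-nlp/Sheared-LLaMA-1.3B"  # assumed repo id

tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

inputs = tok("Structured pruning of large language models", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tok.decode(out[0], skip_special_tokens=True))
```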
## Efficient Mlm M0.15

*princeton-nlp · Large Language Model, Transformers · 116 downloads · 1 like*

A research checkpoint studying how much of the input should be masked in masked language modeling; this model masks 15% of input tokens and uses pre-layer normalization.
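The 15% masking rate studied by this checkpoint corresponds to the standard MLM data-collator setting in `transformers`. A small sketch of that configuration follows; the `roberta-base` tokenizer is an illustrative choice for the demo, not necessarily the checkpoint's own tokenizer.

```python
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

# Illustrative tokenizer choice for the demo (assumption).
tok = AutoTokenizer.from_pretrained("roberta-base")

# mlm_probability=0.15 masks roughly 15% of input tokens, the rate this model studies.
collator = DataCollatorForLanguageModeling(tokenizer=tok, mlm=True, mlm_probability=0.15)

batch = collator([tok("Masked language modeling hides a random subset of tokens.")])
print(batch["input_ids"][0])  # randomly chosen positions replaced by the mask token
print(batch["labels"][0])     # -100 everywhere except the positions selected for masking
```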
## Distilbert Mlm 750k

*vocab-transformers · Large Language Model, Transformers · 26 downloads · 0 likes*

A DistilBERT masked language model. DistilBERT is a lightweight, distilled version of BERT that retains most of its performance with far fewer parameters.
## Rugpt3small Based On Gpt2

*ai-forever · Large Language Model, Other · 46.92k downloads · 42 likes*

A Russian pre-trained Transformer language model developed by the SberDevices team. It is based on the GPT-2 architecture, supports a sequence length of 1,024 tokens, and was trained on 80 billion tokens.
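A minimal Russian text-generation sketch, assuming the Hugging Face repo id `ai-forever/rugpt3small_based_on_gpt2` (inferred from the listing above).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "ai-forever/rugpt3small_based_on_gpt2"  # assumed repo id

tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# Russian prompt: "Alexander Sergeyevich Pushkin was born in"
inputs = tok("Александр Сергеевич Пушкин родился в", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_k=50, top_p=0.95)
print(tok.decode(out[0], skip_special_tokens=True))
```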
## Roberta Base Wechsel Swahili

*benjamin · MIT · Large Language Model, Transformers, Other · 222 downloads · 1 like*

A RoBERTa-base model transferred to Swahili with the WECHSEL method for efficient cross-lingual transfer.
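A minimal fill-mask sketch, assuming the repo id `benjamin/roberta-base-wechsel-swahili` (inferred from the listing above); the pipeline's own mask token is used so the exact mask string does not need to be hard-coded.

```python
from transformers import pipeline

# Assumed repo id, matching the listing above.
fill = pipeline("fill-mask", model="benjamin/roberta-base-wechsel-swahili")

# Swahili: "Nairobi is the capital city of <mask>."
prompt = f"Nairobi ni mji mkuu wa {fill.tokenizer.mask_token}."
for pred in fill(prompt):
    print(pred["token_str"], round(pred["score"], 3))
```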